jbtrystram
Member

Instead of deploying the container to the tree and then copying all the contents to the disk image, use bootc to directly manage the installation to the target filesystems.

Right now this requires using the image as the buildroot, which in turn requires Python in the image (for osbuild). This is tracked in [1].

[1] bootc-dev/bootc#1410

Requires osbuild/osbuild#2149

openshift-ci bot commented Jul 17, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request introduces changes to use bootc install to deploy the container, which simplifies the image build process. There are a few critical issues in the YAML manifest related to copy-paste errors that lead to incorrect configurations for the 4k image builds and missing options for loopback devices. These issues need to be addressed.

@dustymabe
Member

dustymabe commented Jul 17, 2025

I switched the CI on this to run against rawhide (contains python) so we could actually test the change.

@dustymabe
Member

A few diffs picked up by cosa diff --metal from #4226

cosa-diff-metal.txt

We should probably profile each diff (maybe in coreos/fedora-coreos-tracker#1827) and evaluate whether it's a change we want to make or not.

@dustymabe
Member

I can't get a built qemu image to boot. I suspect probably the root= and boot= UUIDs added on the kernel command line?

@jbtrystram
Member Author

> I can't get a built qemu image to boot. I suspect probably the root= and boot= UUIDs added on the kernel command line?

Do you mind sharing more logs? What I am getting locally is Ignition failing on coreos/fedora-coreos-tracker#1250

@dustymabe
Member

Ahh. I see that too now:

[    4.726843] ignition[875]: Ignition failed: failed to create users/groups: failed to configure users: failed to create user "core": exit status 10: Cmd: "useradd" "--root" "/sysroot" "--create-home" "--password" "*" "--comment" "CoreOS Admin" "--groups" "adm,sudo,systemd-journal,wheel" "core" Stdout: "" Stderr: "useradd: cannot lock /etc/group; try again later.\n"

@jbtrystram
Member Author

> I can't get a built qemu image to boot. I suspect probably the root= and boot= UUIDs added on the kernel command line?

Looks like removing those makes the boot process go further: Ignition completes and the system gets out of the initramfs, but it then fails to mount the boot partition.
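
For anyone who wants to repeat the experiment of dropping those arguments, here is a minimal sketch of the string manipulation involved. The helper name `drop_kargs` and the example command line are hypothetical; this is not how cosa or bootc edit kernel arguments, and it does not handle quoted arguments containing spaces.

```python
def drop_kargs(cmdline: str, keys: set[str]) -> str:
    """Return cmdline without any argument whose key is in keys.

    An argument's key is the text before its first '=' (or the whole
    argument when there is no '=').
    """
    kept = []
    for arg in cmdline.split():
        key = arg.split("=", 1)[0]
        if key not in keys:
            kept.append(arg)
    return " ".join(kept)


# Hypothetical command line, for illustration only.
example = "rw root=UUID=1111 boot=UUID=2222 console=ttyS0"
print(drop_kargs(example, {"root", "boot"}))
```

Splitting on the first `=` only is what keeps values such as `UUID=1111` intact while still matching on the `root` and `boot` keys.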

@jbtrystram
Member Author

Blocked on bootc-dev/bootc#1441

@jbtrystram
Member Author

OK, this works with the following PRs:

The bootc PR can be built and then added into the image through overrides/rootfs. Make sure to build rawhide.

@jbtrystram
Member Author

Follow-up: either find a way to get the boot components inside cosa, or change the bootc code to call bootupd from the deployed root. I think the latter is preferable.
I filed bootc-dev/bootc#1455

@jbtrystram
Member Author

jbtrystram commented Jul 29, 2025

> follow-up : either find a way to get the boot components inside cosa, or change the bootc code to call bootupd from the deployed root . I think the latter is preferable. I filed bootc-dev/bootc#1455

Made bootc-dev/bootc#1460
With this, we no longer need to use the container as the buildroot: cosa works as the buildroot, so we could do this on all streams.

@jbtrystram jbtrystram force-pushed the osbuild-bootc-install-fs branch 4 times, most recently from bb4270f to 310bd60 Compare July 30, 2025 07:38
@jbtrystram
Member Author

Alright, marking this as ready for review as all the bits are in place.
I guess I need to update the osbuild manifests for the other arches as well, but I'll do that after a review to reduce the amount of back and forth.

This will need a release of bootc.

@jbtrystram jbtrystram marked this pull request as ready for review July 30, 2025 07:44
Member

@dustymabe dustymabe left a comment

Some comments.

I think there are a few things we need to iron out before we can really move forward with this:

  1. supporting both old and new paths at the same time

Do we need to? Usually when we make a change this large we roll it out slowly, which means we have to support both ways for some time.

This PR is ignoring that fact, but TBH looking at OSBuild configs that support both would be pretty intimidating, so I'm not excited about trying to do that either. I'd be interested in @jlebon or @travier's thoughts.

  2. We need to make sure that every diff between images generated this way and the old way is considered and acknowledged as acceptable before we make this change.

@jlebon
Member

jlebon commented Aug 1, 2025

> Some comments.
>
> I think there are a few things we need to iron out before we can really move forward with this:
>
> 1. supporting both old and new paths at the same time
>
> Do we need to? Usually when we make a change this large we roll it out slowly, which means we have to support both ways for some time.
>
> This PR is ignoring that fact, but TBH looking at OSBuild configs that support both would be pretty intimidating, so I'm not excited about trying to do that either. I'd be interested in @jlebon or @travier's thoughts.

Rolling this out in FCOS first would be preferable indeed. Instead of adding mpp conditionals, could we just have a separate set of manifests to use temporarily? Not ideal either, I know.

If we're sufficiently convinced by the diff, we could do a hard cut over. The problem is in ensuring the diff captures everything well. I think it helps a lot though that the main changes here are happening at the filesystem level.
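
Since the main changes are at the filesystem level, one way to sanity-check a diff is to compare file digests across two mounted image trees. The sketch below is a generic stand-in written for illustration; it is not what `cosa diff --metal` does, the function names are hypothetical, and it ignores symlinks, ownership, permissions, xattrs and SELinux labels, all of which would matter for a real comparison.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> dict[str, str]:
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        # Skip symlinks and directories: only regular file contents are hashed.
        if path.is_symlink() or not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_trees(old: Path, new: Path) -> dict[str, list[str]]:
    """Classify paths as only in old, only in new, or present in both but changed."""
    a, b = tree_digest(old), tree_digest(new)
    return {
        "only_old": sorted(a.keys() - b.keys()),
        "only_new": sorted(b.keys() - a.keys()),
        "changed": sorted(p for p in a.keys() & b.keys() if a[p] != b[p]),
    }
```

Each of the three buckets could then be reviewed and acknowledged separately before deciding on a hard cutover.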

@jbtrystram jbtrystram force-pushed the osbuild-bootc-install-fs branch 3 times, most recently from 48a30b4 to 77121d0 Compare August 28, 2025 15:44
@jbtrystram
Member Author

> Rolling this out in FCOS first would be preferable indeed. Instead of adding mpp conditionals, could we just have a separate set of manifests to use temporarily? Not ideal either, I know.

I went down this route: I duplicated the manifest and added a switch to the osbuild wrapper to use it when releasever is >44. We can drop this when we are confident it can go to all streams.

I cleaned up the commits a bit and incorporated @dustymabe's review suggestions.
If that approach looks good, I'll generalize it to arches other than x86 (or maybe not?) in another commit.
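
As a rough illustration of the switch described above (not the actual wrapper code), the selection logic amounts to something like the following. Both manifest file names and the function name are hypothetical placeholders; only the "releasever > 44" condition comes from this comment.

```python
# Hypothetical file names, for illustration only.
LEGACY_MANIFEST = "legacy.osbuild.x86_64.mpp.yaml"
BOOTC_MANIFEST = "bootc.osbuild.x86_64.mpp.yaml"

# Threshold taken from the comment above: use the new manifest when releasever is >44.
BOOTC_RELEASEVER_THRESHOLD = 44


def select_manifest(releasever: int) -> str:
    """Pick the duplicated bootc manifest for newer releases, else the legacy one."""
    if releasever > BOOTC_RELEASEVER_THRESHOLD:
        return BOOTC_MANIFEST
    return LEGACY_MANIFEST
```

Keeping the condition in one place means dropping the legacy path later is a matter of deleting the old manifest and this branch.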

@jbtrystram
Member Author

jbtrystram commented Aug 28, 2025

OK, I found another quirk: bootc-dev/bootc#1559

A workaround would be to use the container as the osbuild buildroot (but that would only work with rawhide, as we need Python). This is what I was doing until now, which is why I didn't run into it before.

But since we are only using this manifest for rawhide for now, why not.

@jbtrystram
Member Author

requires coreos/bootupd#990

@jbtrystram jbtrystram force-pushed the osbuild-bootc-install-fs branch from 77121d0 to 9037fa7 Compare September 8, 2025 09:40
The disk usage log message coming every 10 seconds is quite noisy;
hide it by default. We can always uncomment it to investigate.

I also added, as comments, a couple of helpful tips given by @dustymabe
for working with osbuild.

Bootc looks for the prepare-root config file in the buildroot
environment because its main assumption is that it is run from the
target container.
However, in osbuild, it is run from the buildroot, because podman inside
bwrap (inside supermin in our case) causes issues.
That is fine for RHCOS and SCOS, where we use the target container as the
buildroot, but we cannot do that for FCOS because we require Python in
the buildroot.

For now, insert a prepare-root file in the supermin VM (used as the
buildroot for osbuild) until either:
- bootc learns to look into the container for it [1]
- we ship Python in our images and can use them as the buildroot.

Another approach would be to layer Python and the osbuild dependencies
on top of our image and use that as the buildroot, but that would leave
room for package drift (what was in the repos at build time?). At least
with COSA it is easier to keep track of versions.

[1] bootc-dev/bootc#1410

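A minimal sketch of the workaround described in this commit message, assuming the file lives at `usr/lib/ostree/prepare-root.conf` under the buildroot and carries a `[composefs]` section. The helper name, the `enabled = true` value and the use of Python are assumptions for illustration; the actual change may write the file differently.

```python
import configparser
from pathlib import Path

# Assumed location of the ostree prepare-root config, relative to the buildroot.
PREPARE_ROOT_RELPATH = Path("usr/lib/ostree/prepare-root.conf")


def write_prepare_root(buildroot: Path, composefs_enabled: str = "true") -> Path:
    """Write a prepare-root config into buildroot and return its path.

    The [composefs] enabled value is an assumption; it should mirror
    whatever the target image itself ships.
    """
    config = configparser.ConfigParser()
    config["composefs"] = {"enabled": composefs_enabled}
    target = buildroot / PREPARE_ROOT_RELPATH
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w") as f:
        config.write(f)
    return target
```

The value written here has to stay in sync with the target image by hand, which is the drawback that having bootc read the file from the container [1] would remove.
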
Instead of deploying the container to the tree and then copying all the
contents to the disk image, use bootc to directly manage the installation
to the target filesystems.

Right now this requires using the image as the buildroot, which in turn
requires Python in the image (for osbuild). This is tracked in [1].
As we have Python in rawhide now, I duplicated the manifest and added a
switch in the osbuild wrapper script.

We can keep the manifest duplicated until we are confident enough to roll
this out to all streams.

[1] bootc-dev/bootc#1410

Requires:
bootc-dev/bootc#1460
bootc-dev/bootc#1451
osbuild/osbuild#2149
osbuild/osbuild#2152

All of which have landed in osbuild-159 and bootc 1.6

@jbtrystram jbtrystram force-pushed the osbuild-bootc-install-fs branch from 9037fa7 to d54d764 Compare September 8, 2025 09:40
@jbtrystram
Member Author

jbtrystram commented Sep 8, 2025

Rebased and removed osbuild patches as they have landed in an upstream release now (requires a COSA rebuild though)

openshift-ci bot commented Sep 8, 2025

@jbtrystram: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/images | d54d764 | link | true | /test images |
| ci/prow/rhcos | d54d764 | link | true | /test rhcos |
